Moving Large Data Sets Over High-Performance Long Distance Networks

Authors

  • Stephen W. Hodson
  • Stephen W. Poole
  • Thomas M. Ruwart
  • Bradley W. Settlemyer
Abstract

In this project we look at the performance characteristics of three tools used to move large data sets over dedicated long distance networking infrastructure. Although performance studies of wide area networks have been a frequent topic of interest, performance analyses have tended to focus on network latency characteristics and peak throughput using network traffic generators. In this study we instead perform an end-to-end long distance networking analysis that includes reading large data sets from a source file system and committing large data sets to a destination file system. An evaluation of end-to-end data movement is also an evaluation of the system configurations employed and the tools used to move the data. For this paper, we have built several storage platforms and connected them with a high performance long distance network configuration. We use these systems to analyze the capabilities of three data movement tools: BBcp, GridFTP, and XDD. Our studies demonstrate that existing data movement tools do not provide efficient performance levels or exercise the storage devices in their highest performance modes. We describe the device information required to achieve high levels of I/O performance and discuss how this data is applicable in use cases beyond data movement performance.
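As a rough illustration of what an end-to-end measurement means here, the sketch below times a chunked copy from a source file to a destination file, including the final commit to storage. It is only a local stand-in under stated assumptions: it does not cross a network and does not reproduce BBcp, GridFTP, or XDD. The names `timed_copy` and `throughput_mib_per_s` and the 8 MiB default chunk size are hypothetical choices made for this example, not values from the paper.

```python
import os
import time


def timed_copy(src_path, dst_path, chunk_size=8 * 1024 * 1024):
    """Copy src_path to dst_path in fixed-size chunks (hypothetical helper).

    Returns (bytes_copied, elapsed_seconds). The timer covers reading from
    the source, writing to the destination, and the final fsync, so the
    result reflects both file systems rather than a single stage.
    """
    copied = 0
    start = time.perf_counter()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
        # Flush Python's buffer, then ask the OS to commit the data to
        # the destination device before the timer stops.
        dst.flush()
        os.fsync(dst.fileno())
    elapsed = time.perf_counter() - start
    return copied, elapsed


def throughput_mib_per_s(num_bytes, seconds):
    """Convert a byte count and a duration into MiB per second."""
    if seconds <= 0:
        return float("inf")
    return num_bytes / (1024 * 1024) / seconds
```

Calling `timed_copy` with several chunk sizes and comparing the resulting `throughput_mib_per_s` values gives a simple way to see how request size affects the measured end-to-end rate on a given pair of file systems.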

Similar resources

Using a Fuzzy Auto Regressive Integrated Moving Average Model for Exchange Rate Forecasting

Forecasting models have wide applications in decision making. In the real world, rapid changes normally take place in different areas, specifically in financial markets. Collecting the required data is a main problem for forecasters in such unstable environments. Forecasting methods such as Auto Regressive Integrated Moving Average (ARIMA) models and also Artificial Neural Networks (ANNs) need ...

A Hybrid Time Series Clustering Method Based on Fuzzy C-Means Algorithm: An Agreement Based Clustering Approach

In recent years, the advancement of information gathering technologies such as GPS and GSM networks have led to huge complex datasets such as time series and trajectories. As a result it is essential to use appropriate methods to analyze the produced large raw datasets. Extracting useful information from large data sets has always been one of the most important challenges in different sciences,...

Feature Selection for Small Sample Sets with High Dimensional Data Using Heuristic Hybrid Approach

Feature selection can significantly be decisive when analyzing high dimensional data, especially with a small number of samples. Feature extraction methods do not have decent performance in these conditions. With small sample sets and high dimensional data, exploring a large search space and learning from insufficient samples becomes extremely hard. As a result, neural networks and clustering a...

Communication issues within high performance computing grids

This paper presents several ideas pertaining to desirable properties associated with protocols for moving large amounts of data over a grid of high performance computers. The protocol enhancements are discussed in terms of their scalability and identifying sources of potential protocol redundancy. In the context of this paper it is scalability as it relates to the transport of increasingly larg...

Publication date: 2011